I've heard often that deploying a traditional monolithic Rails app (i.e. no internal Web API, no message queue, no Redis/memcached server) to multiple servers can produce a bunch of bugs that are very hard to debug but I'm having a hard time coming up with some concrete examples despite a few hours of googling
Some obvious issues that I can think of are:
Observers - likely will not work properly as the observation is only propagated on one server and not all of them (assuming there is no Message Queue)
Sessions - would probably need to store these in the database which would need it's own host
Caches - any sweepers would have issues propagating invalidations between servers.
Anyone else care to contribute? I'd really appreciate any articles others may have come across or just general wisdom :)
Observers are just code callbacks.
They run on each process, on each server.
Sessions have defaulted to the cookie store for the last few years.
So multiple servers are no problem.
If you don't have enough space in your cookie then I suggest you may be doing something wrong.
Cache invalidation is indeed a problem.
But it always is.
One solution is to break your cache out into a standalone service.
Sites like Facebook have giant farms of memcache
I think scaling and clustering is always a hard problem.
But this seems to be an old argument against rails.
If anything the last few years have seen rails shine in this respect.
With ec2, nosql, and server automation becoming quite a norm in the community.
Related
Until now, our site has had a modest amount of traffic. None of our developers are big ops guys, but we've stayed ahead of it and keep the site up and running pretty quick. That said, our dev team is stretched, we've accumulated some technical debt, and there's plenty of opportunity to optimize.
Without getting into specifics, we just found out that we'll be expecting a massive amount of traffic in the near future in a very short period time. On the order of several million hits in a few hours. Scaling is one thing, but this is several orders of magnitude greater than what we're seeing now.
We're a Rails app hosted on S3 using ELB, and Postgresql.
I wanted to field some recommendations for broad starting points for scaling and load testing given this situation.
Update: Sorry, EC2, late night :)
#LastZactionHero
Pretty interesting question, let me answer you in detail, I hope you are talking about some e-commerce applications, enterprise or B2B apps doenst see spikes as such. Since you already mentioned that you are hosted your rails app on s3. Let me make couple of things clear.
1)You cant host an rails app on s3. S3 is simple storage service. Where you can only store files.
2) I guess you have hosted your rails app on AWS ec2 with a elastic load balancer attached above the ec2 instances which is pretty good.
3)You have a self managed Postgresql deployed on a ec2 instance.
If you are running on AWS you are half way safe and you can easily scale up and scale down.
I can see one problem in your present model, that your db. AWS has got db as a service. Thats called Relation database service.Which supports Mysql Oracle and MS SQL server.
RDS comes with lot of features like auto back up of your database, high IOPS etc.
But it doesnt support your Postgresql. You need to have or manage a self managed ec2 instance and run postgresql database, but make sure its fail safe and you do have proper back and restore system at place.
AWS provides auto scaling api and command line tools, pretty easy.
You dont have worry about the bandwidth issue etc, but I admit Angelo's answer too.
You can use elastic mem cache for caching your app. Use CDN if need to speed your app. RDS can manage upto 30000 IOPS, its a monster to it will do lot of work for you.
Feel free to ask me if you need any kind of help.
(Disclaimer: I am a senior devOps engineer working for an e-commerce company, use ruby on rails)
Congratulations and I hope your expectation pans out!!
This is such a difficult question to comprehensively answer given the available information. For example, is your site heavy on db reads, writes or both (and is your sharding/replication strategy in line with your db strain)? Is bandwidth an issue, etc? Obvious points would focus on making sure you have access to the appropriate hardware and that your recipies for whatever you use to provision/deploy your hardware is up to date and good to go. You can often throw hardware at a sudden spike in traffic until you can get to the root of whatever bottlenecks you discover (and yes, you will discover them at inconvenient times!)
Regarding scaling your app, you should at least:
1) Cache whatever you can. Pay attention to cache expiration, etc.
2) Be sure your DB has appropriate indexes set up (essentially, you should have an index on any field you're searching on.)
3) Watch your logs closely to identify potential long queries, N+1 queries, long view renders, etc.
4) Do things like what Shopify outlines in this post: http://www.shopify.com/technology/7535298-what-does-your-webserver-do-when-a-user-hits-refresh#axzz2O0gJDotV
5) Set up a good monitoring system (Monit, God, etc) for each layer of your stack - sudden spikes in traffic can quickly bottleneck your application in unexpected places and lead to more issues. The cascade can happen quickly.
6) Set up cron to automate all those little tasks you currently do manually...that you will probably forget about doing once you're dealing with traffic spikes.
7) Google scaling rails and you'll see tons of good info.
8) etc, etc, etc...
You can use some profiling tools (rubyperf, or something like NewRelic, etc) Whatever response you get from them is probably best to be considered as a rough baseline at best. Simple reason being that your profiling is dependent on your hardware stack which will certainly change depending on actual traffic patterns. Pretty easy to do if you have a site with one page of static content...incredibly difficult to do if you have a CMS site with a growing db and growing traffic.
Good luck!!!
I'm studying the best way to have multiple redmine instances in the same server (basically I need a database for each redmine group).
Until now I have 2 options:
Deploy a redmine instance for each group
Deploy one redmine instance with multiple database
I really don't know what is the best practice in this situation, I've seen some people doing this in both ways.
I've tested the deployment of multiple redmines (3 instances) with nginx and passenger. It worked well but I think with a lot of instances it may not be feasible. Each app needs around 100mb of RAM, and with the increasing of requests it tends to allocate more processes to the app. This scenario seems bad if we had a lot of instances.
The option 2 seems reasonable, I think I can implement that with rails environments. But I think that there are some security problems related with sessions (I think a user of site A is allowed to make actions on site B after an authentication in A).
There are any good practice for this situation? What's the best practice to take in this situation?
Other requirement related with this is: we must be able to create or shut down a redmine instance without interrupt the others (e.g. we should avoid server restarts..).
Thanks for any advice and sorry for my english!
Edit:
My solution:
I used a redmine instance for each group. I used nginx+unicorn to manage each instance independently (because passenger didn't allow me to manage each instance independently).
The two options are not so different after all. The only difference is that in option 2, you only have one copy of the code on your disk.
In any case, you still need to run different worker processes for each instance, as Redmine (and generally most Rails apps) doesn't support database switching for each request and some data regarding a certain environment are cached in process.
Given that, there is not really much incentive to share even the codebase as it would require certain monkey patches and symlink-magic to allow the proper initialization for the intentional configuration differences (database and email configuration, paths to uploaded files, ...). The Debian package does that but it's (in my eyes) rather brittle and leads to a rather non-standard system.
But to stress again: even if you share the same code on the disk between instances, you can't share the running worker processes.
Running multiple instances from the same codebase is not officially supported by Redmine. However, Debian/Ubuntu packages seem to support such approach... See:
Multiple instances of redmine on Debian squeeze
So, generally:
If you use Debian/Ubuntu go with option #2
Otherwise go with #1
Rolling forward a couple of years, and you might now want to consider a third option of using docker containers for each of your redmine instances.
I've been using https://github.com/sameersbn/docker-redmine.git , and have been quite happy with it except that it doesn't yet support handling of incoming mail for creating and commenting on tickets.
I'm new to AppFabric and I'm evaluating a distributed cache solution for a production environment and I'm in a Microsoft shop using Asp.net MVC and WebApi, and we are not using Windows Azure.
In setting up AppFabric on my local computer, there was a step to create a database or use xml, I'd like to understand the concept here. Is AppFabri depending on a datasource (db or xml files) to persist? If so, wouldn't this be a potential bottleneck?
Also, could anyone who is using AppFabric right now on their production servers comment on their experience using it? Any pitfalls or gotchas?
Thanks, really appreciate it!
I can give you my experience from a little over a year ago after deploying AppFabric in a production environment. I haven't kept up with it since because my experience wasn't great. Maybe they've fixed some of the issues we had.
The step to create a database or XML is just to store configuration information for the cluster.
My notes (remember this was current a year ago, so maybe things have changed):
When caching an object from C#, the object was turned into XML and stored. It was a verbose format and made storing/getting a little slower than it should have been. I would have rather had the object serialized to a binary format - or compressed - or just anything other than uncompressed XML. We actually modified our objects to have shorter property names when cached since they turned into XML. This caused some objects to go from 1MB down to a couple hundred kilobytes.
AppFabric was communicating over the NetTcp protocol in Windows and this caused us some grief. We had some servers without the Windows service installed (NetTcp) and that caused headaches. We couldn't figure out why AppFabric worked on one machine and not another.
It did seem to do well with distributing the load across multiple machines in the cluster. It also seemed to retrieve stuff fast and the expiration logic always worked fine.
At the time it was just a fairly immature product. We couldn't find any support anywhere for it. I remember being on the phone with Microsoft trying to track down some issues we had and every time the person would say, "AppFabric? What is that?" The community around it at the time was non-existent. (This was really painful for us.)
If I had to do an application for Windows that needed a distributed cache I would have to re-evaluate AppFabric. My first experience wasn't the greatest. Now I'd think of Redis, Couchbase, Memcached - in that order.
I need to host a lot of simple rails/sinatra/padrino applications of different ruby versions each with 0..low hits per day. They belong to different owners and should be well isolated from each other.
When an app is hit it should respond in reasonably short time, but I expect several simultaneous visitors are hitting the same site to be a rare case.
I'm going to create separate os user for each application. Surely I'd like to put them as many per server as it's possible. Thus I need to choose the web server with the lowest memory footprint, which can run applications on behalf of different users with different ruby versions and gemsets.
I consider webrick,nginx+passenger,thin,apache+passenger. I suppose the reliability of all choices is sufficient for such a job, and while performance isn't an issue, the memory consumption is.
I found many posts regarding performance issues, but most of them discuss the performance tuning and issues. I couldn't find a comparison of web servers memory usage when idle.
Is "in process" webrick the best choice? Which one would you choose for that job?
And I couldn't figure out how to resolve subdomains to application ports with webrick. Shall I use nginx or apache for that?
I don't have much experience with hosting myself, but using Webrick for production is not a good idea I think. You can also check out mongrel which I saw used in production. In most cases though you will probably want to choose between thin and unicorn. Check out this http://cmelbye.github.com/2009/10/04/thin-vs-unicorn.html or google around. Good luck :-)
Why not use Heroku? Its free and gets you out of the hassle of server configuration and maintenance.
It seems that the only tutorials out there talking about using Amazon's SimpleDB in a rails site are using AWSDBProxy... Personally, I find this counter-intuitive to scaling out, considering the server layout of a typical Rails site below (using AWSDBProxy):
Plugin here: http://agilewebdevelopment.com/plugins/aws_sdb_proxy
Image here: http://www.freeimagehosting.net/uploads/91be4e0617.png
As you can see, even if we add more mongrels, we have two problems.
We have a single point of failure far less stable than our load balancer
We have to force all our information through this one WEBrick server
The solution is, of course, to add more AWSDBProxies... but why not then just use the following code in say, a class, skipping the proxy all together?
service = AwsSdb::Service.new(Logger.new(nil),
CONFIG['aws_access_key_id'],
CONFIG['aws_secret_access_key'])
service.query(domain, query)
So what I'm getting at, is if you are using AWSDBProxy, what are you justifications for it? And if you are indeed using it, what is your performance like? If you have hard numbers, this would be even more appreciated!
I'm not using it, nor have I ever heard of it, but this is what I would think are reasonable reasons.
You're running your main app server on EC2, so the chance of Internet FAIL doesn't really affect you more than once.
You run one proxy on each of your app servers. So it's connection going down is no worse than it's connection(s) to the database going down.
Because it can be done. This is as good a reason as any in an open source project. Sometimes it takes building a thing before you know whether said thing is a good/bad idea.
You don't have the traffic levels to need a load balancer. Then your diagram squashes down to a line, if not a single machine.