Are you using AWSDBProxy? Is there a performance hit when scaling out? - ruby-on-rails

It seems that the only tutorials out there talking about using Amazon's SimpleDB in a rails site are using AWSDBProxy... Personally, I find this counter-intuitive to scaling out, considering the server layout of a typical Rails site below (using AWSDBProxy):
Plugin here: http://agilewebdevelopment.com/plugins/aws_sdb_proxy
Image here: http://www.freeimagehosting.net/uploads/91be4e0617.png
As you can see, even if we add more mongrels, we have two problems.
We have a single point of failure far less stable than our load balancer
We have to force all our information through this one WEBrick server
The solution is, of course, to add more AWSDBProxies... but why not then just use the following code in say, a class, skipping the proxy all together?
service = AwsSdb::Service.new(Logger.new(nil),
CONFIG['aws_access_key_id'],
CONFIG['aws_secret_access_key'])
service.query(domain, query)
So what I'm getting at, is if you are using AWSDBProxy, what are you justifications for it? And if you are indeed using it, what is your performance like? If you have hard numbers, this would be even more appreciated!

I'm not using it, nor have I ever heard of it, but this is what I would think are reasonable reasons.
You're running your main app server on EC2, so the chance of Internet FAIL doesn't really affect you more than once.
You run one proxy on each of your app servers. So it's connection going down is no worse than it's connection(s) to the database going down.
Because it can be done. This is as good a reason as any in an open source project. Sometimes it takes building a thing before you know whether said thing is a good/bad idea.
You don't have the traffic levels to need a load balancer. Then your diagram squashes down to a line, if not a single machine.

Related

Whats difference between fail-over and replication?

Whats difference between fail-over and replication? I tried reading from this article https://github.com/donnemartin/system-design-primer#availability-in-numbers but I could not understand the difference.
Replication is creating or maintaining multiple copies of something -- generally your database, but possibly more, such an an image of your entire server.
Failover is when one system detects that another has failed, and responds by taking over its duties.
They are completely different things, though they are often are used together to serve purposes such as fault tolerance and disaster preparedness.

Scaling to support a massive amount of traffic in a short period of time

Until now, our site has had a modest amount of traffic. None of our developers are big ops guys, but we've stayed ahead of it and keep the site up and running pretty quick. That said, our dev team is stretched, we've accumulated some technical debt, and there's plenty of opportunity to optimize.
Without getting into specifics, we just found out that we'll be expecting a massive amount of traffic in the near future in a very short period time. On the order of several million hits in a few hours. Scaling is one thing, but this is several orders of magnitude greater than what we're seeing now.
We're a Rails app hosted on S3 using ELB, and Postgresql.
I wanted to field some recommendations for broad starting points for scaling and load testing given this situation.
Update: Sorry, EC2, late night :)
#LastZactionHero
Pretty interesting question, let me answer you in detail, I hope you are talking about some e-commerce applications, enterprise or B2B apps doenst see spikes as such. Since you already mentioned that you are hosted your rails app on s3. Let me make couple of things clear.
1)You cant host an rails app on s3. S3 is simple storage service. Where you can only store files.
2) I guess you have hosted your rails app on AWS ec2 with a elastic load balancer attached above the ec2 instances which is pretty good.
3)You have a self managed Postgresql deployed on a ec2 instance.
If you are running on AWS you are half way safe and you can easily scale up and scale down.
I can see one problem in your present model, that your db. AWS has got db as a service. Thats called Relation database service.Which supports Mysql Oracle and MS SQL server.
RDS comes with lot of features like auto back up of your database, high IOPS etc.
But it doesnt support your Postgresql. You need to have or manage a self managed ec2 instance and run postgresql database, but make sure its fail safe and you do have proper back and restore system at place.
AWS provides auto scaling api and command line tools, pretty easy.
You dont have worry about the bandwidth issue etc, but I admit Angelo's answer too.
You can use elastic mem cache for caching your app. Use CDN if need to speed your app. RDS can manage upto 30000 IOPS, its a monster to it will do lot of work for you.
Feel free to ask me if you need any kind of help.
(Disclaimer: I am a senior devOps engineer working for an e-commerce company, use ruby on rails)
Congratulations and I hope your expectation pans out!!
This is such a difficult question to comprehensively answer given the available information. For example, is your site heavy on db reads, writes or both (and is your sharding/replication strategy in line with your db strain)? Is bandwidth an issue, etc? Obvious points would focus on making sure you have access to the appropriate hardware and that your recipies for whatever you use to provision/deploy your hardware is up to date and good to go. You can often throw hardware at a sudden spike in traffic until you can get to the root of whatever bottlenecks you discover (and yes, you will discover them at inconvenient times!)
Regarding scaling your app, you should at least:
1) Cache whatever you can. Pay attention to cache expiration, etc.
2) Be sure your DB has appropriate indexes set up (essentially, you should have an index on any field you're searching on.)
3) Watch your logs closely to identify potential long queries, N+1 queries, long view renders, etc.
4) Do things like what Shopify outlines in this post: http://www.shopify.com/technology/7535298-what-does-your-webserver-do-when-a-user-hits-refresh#axzz2O0gJDotV
5) Set up a good monitoring system (Monit, God, etc) for each layer of your stack - sudden spikes in traffic can quickly bottleneck your application in unexpected places and lead to more issues. The cascade can happen quickly.
6) Set up cron to automate all those little tasks you currently do manually...that you will probably forget about doing once you're dealing with traffic spikes.
7) Google scaling rails and you'll see tons of good info.
8) etc, etc, etc...
You can use some profiling tools (rubyperf, or something like NewRelic, etc) Whatever response you get from them is probably best to be considered as a rough baseline at best. Simple reason being that your profiling is dependent on your hardware stack which will certainly change depending on actual traffic patterns. Pretty easy to do if you have a site with one page of static content...incredibly difficult to do if you have a CMS site with a growing db and growing traffic.
Good luck!!!

issues with deploying rails servers to multiple hosts

I've heard often that deploying a traditional monolithic Rails app (i.e. no internal Web API, no message queue, no Redis/memcached server) to multiple servers can produce a bunch of bugs that are very hard to debug but I'm having a hard time coming up with some concrete examples despite a few hours of googling
Some obvious issues that I can think of are:
Observers - likely will not work properly as the observation is only propagated on one server and not all of them (assuming there is no Message Queue)
Sessions - would probably need to store these in the database which would need it's own host
Caches - any sweepers would have issues propagating invalidations between servers.
Anyone else care to contribute? I'd really appreciate any articles others may have come across or just general wisdom :)
Observers are just code callbacks.
They run on each process, on each server.
Sessions have defaulted to the cookie store for the last few years.
So multiple servers are no problem.
If you don't have enough space in your cookie then I suggest you may be doing something wrong.
Cache invalidation is indeed a problem.
But it always is.
One solution is to break your cache out into a standalone service.
Sites like Facebook have giant farms of memcache
I think scaling and clustering is always a hard problem.
But this seems to be an old argument against rails.
If anything the last few years have seen rails shine in this respect.
With ec2, nosql, and server automation becoming quite a norm in the community.

Best way to run rails with long delays

I'm writing a Rails web service that interacts with various pieces of hardware scattered throughout the country.
When a call is made to the web service, the Rails app then attempts to contact the appropriate piece of hardware, get the needed information, and reply to the web client. The time between the client's call and the reply may be up to 10 seconds, depending upon lots of factors.
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
I basically see two options. Either run JRuby and use multithreading or else run several regular Ruby instances and hope that not many people try to use the service at a time. JRuby seems like the much better solution, but it still doesn't seem to be mainstream and have out of the box support at Heroku and EngineYard. The multiple instance solution seems like a total kludge.
1) Am I right about my two options? Is there a better one I'm missing?
2) Is there an easy deployment option for JRuby?
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
From an engineering perspective, this seems like it would be the best alternative.
Why don't you want to do it?
There's a third option: If you host your Rails app with Passenger and enable global queueing, you can do this transparently. I have some actions that take several minutes, with no issues (caveat: some browsers may time out, but that may not be a concern for you).
If you're worried about browser timeout, or you cannot control the deployment environment, you may want to process it in the background:
User requests data
You enter request into a queue
Your web service returns a "ticket" identifier to check the progress
A background process processes the jobs in the queue
The user polls back, referencing the "ticket" id
As far as hosting in JRuby, I've deployed a couple of small internal applications using the glassfish gem, but I'm not sure how much I would trust it for customer-facing apps. Just make sure you run config.threadsafe! in production.rb. I've heard good things about Trinidad, too.
You can also run the web service call in a delayed background job so that it's not hogging up a web-server and can even be run on a separate physical box. This is also a much more scaleable approach. If you make the web call using AJAX then you can ping the server every second or two to see if your results are ready, that way your client is not held in limbo while the results are being calculated and the request does not time out.

Thin + Nginx Production ready combination for RubyOnRails Application

I have recently installed Nginx + Thin on my deployment server, but i am not sure how this will perform in last requests & responses situation. lets say 1000/req per sec.
so the speed on thin is good with 10-100 req /per sec
I wanted to know on higher volumes of data being processed on the request/response cluster.
Guide me on this :-)
Multiple thin processes and nginx are capable of providing lots of speed, depending on what your application is doing. So, the problem will be your application code, the speed of your application server, and your database server.
Scaling Rails has been recently covered in depth by the Scaling Rails Screencasts. I recommend you start there. My 5 step program to scaling Rails would be:
First step is to have the tools to look at what is slow in your application. Do not spend time optimizing everything in your application when you don't know what the problem is.
The easiest way to be able to handle lots of requests/second is with page caching.
If you can't do that, cache everything possible (fragment caching, use memcached to cache data, etc), to speed up your application.
After that, optimize your application as best as possible, make SQL queries fast, index everything, etc.
If you still need more speed, throw more hardware at the problem. Get a big, powerful database server, a bunch of app servers, and proxy your requests across them. You can start here, too, but it will only delay the optimization process.
If you have a single server I think that the main key is, apart from everything already mentioned, is don't skimp on the specs of it. Trying to get too much to run on too little is just a recipe for disaster.
It is also a good idea to get monit or God monitoring your thin instances, I started out with God, but it leaked memory pretty bad on Ruby 1.8.6 so I stop using it in favour of monit. Monit is written in C I believe and has a tiny memory footprint so I'd recommend that one.
If all that seems like a bit much to keep nginx and thin playing nicely you may want to look into an all in one solution like Passenger or LiteSpeed. I have very little experience with these so can offer no substancial advice for them.

Resources