Ruby on Rails: what performance can I realistically aim for? - ruby-on-rails

I've been building an application in Ruby on Rails 3, and I'm starting to worry about performance optimization. Now I hope that my question is not too subjective for this site, but I'm interested in facts, not a discussion, so here goes:
While I'm trying to get my views to render faster, there is one thing I simply do not know: What should I aim for? Given a reasonably complex page, what load time is realistic? I simply don't have any reference.
What I'm typically seeing for my application is something like this:
Completed 200 OK in 397ms (Views: 341.1ms | ActiveRecord: 17.7ms)
This is on my production server, running Apache/Passenger.
I am the only one (!) making requests on that server, it's a root server (not virtual), running Ubuntu, AMD Athlon 64 X2 5600+, 4 GB RAM
That is, for most of my more complicated actions (not unusually complicated, just assume it's a paginated listing of 20 objects with 5 computed properties each or something) the ActiveRecord times are almost always fine (<20-30ms), but the "views" number is usually >200 ms.
Now, to my question: When I started using RoR my expectation (maybe unrealistic) was that for most consumer-oriented applications with average complexity (let's say something like Facebook, Twitter, etc. WITHOUT the millions of users) I would get < 20 ms load times as long as I was the only one making requests, and that for a single server load times would only approach 100ms or more if there were lots of people making requests at the same time.
My expectation was also that database requests would be the major bottleneck, since all the rest is just relatively simple computations without any real complexity. I thought that it might take 10ms to get all the objects from the database, and then maybe another 5 ms to run the controller code, build the view, etc.
Since I've never been in charge of any production app, I don't know if this expectation was in any way realistic. So I would like somebody with experience to point out to me what my realistic expectation should be.
(e.g. "pretty much everything but really nasty stuff should render in 50 ms tops as long as you are the only one making requests")
or ("actually 300 ms is not unusual for RoR applications, even if you're the only user")
or ("Are you kidding? I get < 10 ms with 150 concurrent requests on a smaller server than yours. There must be something very wrong with your app")
Again, I hope this is not too subjective, but I'm not really interested in an opinion of whether or not RoR is fast, I want facts from someone with more experience on what numbers are average and to be expected from production RoR applications. Otherwise I simply have no clue at what point I should stop optimizing and just accept that I'll never get 10 ms load times.

Gosh, I'm not sure I'm the one to answer this, but since I've been around these waters enough times, I may have an incomplete idea of things to look at.
First of all, response time is pretty subjective. Meaning, it's good enough if it's good enough for you. From my experience, pages resembling your description take about as much time as what you're seeing. So, you're not orders of magnitude off in either direction.
If you want to optimize your view renders with your current architecture, your next step is here, I think. Gregg Pollack does a great job breaking this stuff down and will make sure you're on track. You'll be sure to get your assets cached and your stack fine-tuned. That'll be your most practical general advice.
If you're willing to look at your deployment architecture, Ilya Grigorik raises some great questions in this article and then answers them with Goliath. If your bottleneck is the server-client round trip, that's probably the approach to take.
I try to pay attention to anything Aaron Patterson says about performance, like in this talk. He's going to teach general optimization ideas, most of them for your server-side code. You may catch a few things that relate to your current problem.
I was pulled aside by a former co-worker at MWRC this year and told that I'm absolutely nuts if I'm not building with JRuby these days. It's a bit of a commitment, and I've resisted making major changes like that until I have truly painful response times, which I don't, and it doesn't sound like you have either. However, JRuby's a very mainstream thing to do now, and you and I will likely embrace it for some projects at some point in the future.
So, bottom line, I think you're in the realm of a spry app as you are. I think I'd work down these resources in the order I presented them.

Not knowing what you're rendering, it's hard to comment on the performance, but I would venture to say that 200ms is very high. Don't forget that the debug information in your logs can be a little misleading: if you're querying your DB or some external resource from within a view, as opposed to preloading that data in your controller, then that time will be attributed to view rendering.
Common culprit: you load Model X in your controller, but then access an association in your view, which triggers a bunch of selects under the hood. The time to fetch Model X is low, but the queries for the associated records will show up as "view time".
In other words, dig into the logs, and if it's actually your view code, then bring up a profiler.
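To make that attribution problem concrete, here is a minimal plain-Ruby sketch with no Rails involved. FakeDb, comments_for and comments_for_all are hypothetical names invented for this illustration; they only count how many "queries" each access pattern would issue. In a Rails 3 app the eager variant roughly corresponds to using :include (or includes) in the controller.

```ruby
# Hypothetical in-memory stand-in for a database. It counts "queries" so the
# difference between lazy and eager loading is visible without Rails.
class FakeDb
  attr_reader :query_count

  def initialize(comments_by_post)
    @comments_by_post = comments_by_post
    @query_count = 0
  end

  # One query per post: what a lazy association does when a view touches it.
  def comments_for(post_id)
    @query_count += 1
    @comments_by_post.fetch(post_id, [])
  end

  # One query for all posts: roughly what eager loading does up front.
  def comments_for_all(post_ids)
    @query_count += 1
    post_ids.each_with_object({}) do |id, result|
      result[id] = @comments_by_post.fetch(id, [])
    end
  end
end

data = { 1 => ["a", "b"], 2 => ["c"], 3 => [] }
post_ids = data.keys

# Lazy: every access happens while the "view" iterates over the posts.
lazy = FakeDb.new(data)
post_ids.each { |id| lazy.comments_for(id) }

# Eager: one lookup in the "controller", then the "view" only reads memory.
eager = FakeDb.new(data)
preloaded = eager.comments_for_all(post_ids)
post_ids.each { |id| preloaded[id] }

puts "lazy queries:  #{lazy.query_count}"
puts "eager queries: #{eager.query_count}"
```

With the lazy pattern the query count grows with the number of rows on the page, and all of that time lands in the view column of the log.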

I'm getting view times < 20ms on a $20/month linode server. That's well-optimized code, for a request of medium complexity, running on JRuby. You haven't hit Rails' performance limits by any means. Time to use a profiler and see what's taking so long.

I don't think your 200 ms view time is abnormal, or even high in any way.
However, you have room for improvement. You say " (not unusually complicated, just assume it's a paginated listing of 20 objects with 5 computed properties each or something)"
To me, that's 100 operations that could be pre-calculated, and would speed up your view rendering time.
Finally -- Rendering time doesn't usually have a direct correlation to number of users. Under most deployments, as a request comes in, it is handled by a process and then responded to. Other requests wait until the first is completed before they are processed.

Use static content where possible. Outside of that, use caching where possible, at the highest possible level, preferably at the page level. When content can't be cached, try to get -something- static or cacheable back to the user quickly. You might, for instance, serve up a static page with the basic layout, and an animated busy-image where the content belongs, and then use JavaScript to load the dynamic content.

Related

Rails app: Trouble shooting frequent Handling RequestTimeOut errors

I have a large web app whose load times I have recently been working hard to reduce. I have two controllers, Generator (some 20,000 items) and Product (some 1,500 items), that have been slow for a while, but I have worked with indexes and smart queries. On my dev app the response time is about 500 ms.
From time to time I still get RequestTimeOut on the app and I need help troubleshooting this error. I understand what it means (a request has taken too much time) and I have installed the 'rack-timeout' gem and set it to 15 seconds (which works fine).
I have gone through the entire app (and especially the two slowest controllers, Generator and Product) in search of time to save. I have had some issues with caching that I am currently trying to fix (caching would help quite a bit).
It seems that these timeouts happen mostly when bots (Yandex.ru especially) spider through my site, and especially when they go through one generator after another. The pages may not be very slow any more, but loading so many one after another causes a lot of requests.
Now I am out of ideas and need some help in order to know what to do next and how to continue my troubleshooting:
Is there anything else besides response time that can cause this error? E.g. memory leakage or something? Or is it just a matter of lots of requests on slow controllers?
I haven't been able to test it on my development platform. Is there a way to benchmark and see how the app would handle requests like those from the bots? I seem to remember there was an "Apache-thing" one could use to simulate traffic like this.
Are there any other ways of looking at the problem or troubleshooting this issue from a high-level point of view? Any ideas and thoughts are welcome!

Heroku database performance experience needed?

We are experiencing some serious scaling challenges for our intelligent search engine/aggregator. Our database holds around 200k objects. From profiling and newrelic it seems most of our troubles may come from the database. We are using the smallest dedicated database Heroku provide (Ronin).
We have been looking into indexing and caching. So far we managed to solve our problems by reducing database calls and caching content intelligently, but now even this seems to reach an end. We are constantly asking ourselves if our code/configuration is good enough or if we are simply not using enough "hardware".
We suspect that the database solution we buy from Heroku may be performing insufficiently. For example, just doing a simple count (no joins, no nothing) on the 200k items takes around 250ms. This seems like a long time, even though postgres is known for its bad performance on counts?
We have also started to use geolocation lookups based on latitude/longitude. Both columns are indexed floats. Doing a distance calculation involves pretty complicated math, but we are using the well-recommended geocoder gem, which is supposed to run very optimized queries. Even so, geocoder still takes 4-10 seconds to perform a lookup on, say, 40,000 objects, returning only the nearest 10. This again sounds like a long time, and all the experienced people we consult say that it sounds very odd, again hinting at the database performance.
So basically we wonder: What can we expect from the database? Might there be a problem? And what can we expect if we decide to upgrade?
An additional question I have is: I read here that we can improve performance by loading the entire database into memory. Are we supposed to configure this ourselves and if so how?
UPDATE ON THE LAST QUESTION:
I got this from the helpful people at Heroku support:
"What this means is having enough memory (a large enough dedicated database) to store your hot data set in memory. This isn't something you have to do manually, Postgres is configured to automatically use all available memory on our dedicated databases.
I took a look at your database and it looks like you're currently using about 1.25 GB of RAM, so you haven't maxed your memory usage yet."
UPDATE ON THE NUMBERS AND FIGURES
Okay so now I've had time to look into the numbers and figures, and I'll try to answer the questions below as follows:
First of all, the db consists of around 29 tables with a lot of relations. But in reality most queries are done on a single table (some additional resources are joined in, to provide all needed information for the views).
The table has 130 columns.
Currently it holds around 200k records but only 70k are active - hence all indexes are made as partial-indexes on this "state".
All columns we search are indexed correctly and none is of text-type, and many are just booleans.
Answers to questions:
Hmm, the baseline performance is kind of hard to tell, we have sooo many different selects. The time it takes typically varies from 90ms to 250ms when selecting a limit of 20 rows. We have a LOT of counts on the same table, all varying from 250ms to 800ms.
Hmm well, that's hard to say because they won't give it a shot.
We have around 8-10 users/clients running requests at the same time.
Our query load: New Relic's database report says this about the last 24 hours: throughput: 9.0 cpm, total time: 0.234 s, avg time: 25.9 ms
Yes, we have examined the query plans of our long-running queries. The count queries are especially slow, often over 500ms for a pretty simple count on the 70k records, done on indexed columns, with a result around 300.
I've tuned a few Rails apps hosted on Heroku, and also hosted on other platforms, and usually the problems fall into a few basic categories:
Doing too much in ruby that could be done at the db level (sorting, filtering, join data, etc)
Slow queries
Inefficient use of indexes (not enough, or too many)
Trying too hard to do it all in the db (this is not as common in rails, but does happen)
Not optimizing cacheable data
Not effectively using background processing
Right now it's hard to help you because your question doesn't contain any specifics. I think you'll get a better response if you pinpoint the biggest issue you need help with and then ask.
Some info that will help us help you:
What is the average response time of your actions? (from new relic, request-log-analyzer, logs)
What is the slowest request that you want help with?
What are the queries and code in that request?
Is the site's performance different when you run it locally vs. heroku?
In the end I think you'll find that it is not an issue specific to Heroku, and if you had your app deployed on amazon, engineyard, etc you'd have the same performance. The good news is I think that your problems are common, and shouldn't be too hard to fix once you've done some benchmarking and profiling.
-John McCaffrey
We are constantly asking...
...this seems a lot...
...that is suspected...
...What can we expect...
Good news! You can put an end to seeming, suspecting, wondering and expecting through the magic of measurement!!!
Seriously though, you've not mentioned any of the basic points you'd need to get a useful answer:
What's the baseline performance of the DB running a sequential scan and single-row index fetches? You say Heroku say your DB fits in RAM, so you shouldn't see disk I/O issues when you measure.
Does this performance match whatever Heroku say it should be?
How many concurrent clients?
What's your query load - what queries and how often?
Have you checked the query plans for any of your suspiciously long-running queries?
Once you've got this sort of information, maybe someone can say something useful. As it stands anything you read here is just guesswork.
First: you should check your postgres configuration (run SHOW ALL from within psql or another client, or just look at postgresql.conf in the data directory). The parameter with the largest impact on performance is effective_cache_size, which should be set to about (total_physical_ram - memory_in_use_by_kernel_and_all_processes). For a 4GB machine, this is often around 3GB (4-1). (This is very coarse tuning, but will give the best results for a first step.)
Second: why do you want all the counts? Better to use a typical query: just ask for what is needed, not what is available. (Reason: there is no possible optimisation for a COUNT(*): either the whole table or a whole index needs to be scanned.)
Third: start gathering and analysing some queryplans (for typical queries that perform badly). You can get a query plan by putting EXPLAIN ANALYZE before the actual query. (another way is to increase the logging level, and obtain them from the logfile) A bad queryplan can point you at missing statistics or indexes, or even at bad data-modelling.
Newrelic monitoring can be included as an add-on for heroku (http://devcenter.heroku.com/articles/newrelic). At the very least this should give you a lot of insight into what is happening behind the scenes, and may help you pinpoint some issues.

When/what to cache in Rails 3

Caching is something that I kind of ignored for a long time, as projects that I worked on were on local intranets with very little activity. I'm working on a much larger Rails 3 personal project now, and I'm trying to work out what and when I should cache things.
How do people generally determine this?
If I know a site is going to be relatively low-activity, should I just cache every single page?
If I have a page that calls several partials, is it better to do fragment caching in those partials, or page caching on those partials?
The Ruby on Rails guides did a fine job of explaining how caching in Rails 3 works, but I'm having trouble understanding the decision-making process associated with it.
Don't ever cache for the sake of it, cache because there's a need (with the exception of something like the homepage, which you know is going to be super popular.) Launch the site, and either parse your logs or use something like NewRelic to see what's slow. From there, you can work out what's worth caching.
Generally though, if something takes 500ms to complete, you should cache, and if it's over 1 second, you're probably doing too much in the request, and you should farm whatever you're doing to a background process…for example, fetching a Twitter feed, or manipulating images.
EDIT: See apneadiving's answer too, he links to some great screencasts (albeit based on Rails 2, but the theory is the same.)
You'll want to think about caching several kinds of things:
Requests that are hit a lot, and seldom change
Requests that are "expensive" to draw, lots of database calls, etc. Also hopefully these seldom change.
The other side of caching that shouldn't go without mention is expiration. It's also often the harder part. You have to know when a cache is no longer good, and clear it out so fresh content will be generated. Sweepers or Observers, depending on how you implement your cache, can help you with this. You could also do it just based on a time value: allow caches to have a max-age and clear them after that no matter what.
As for fragment vs full page caching, think of it in terms of how often those parts are updated. If 3 partials of a page are never updated, and one is, maybe you want to cache those 3, and allow that 1 to be fetched live so you can have up-to-the-second accuracy. Or the different partials of a page may need different caching rules: maybe a "timeline" section is cached but has a cache-age of 1 minute, while the "friends" partial is cached for 12 hours.
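As a rough illustration of max-age expiry, here is a minimal plain-Ruby sketch. TtlCache is a hypothetical class written for this example, not a Rails API; in a real Rails 3 app you would use the framework's own cache store. The injectable clock is only there so the behaviour can be demonstrated without waiting.

```ruby
# Minimal time-based cache: an entry older than max_age seconds is treated
# as stale and recomputed on the next fetch. Hypothetical, for illustration.
class TtlCache
  def initialize(max_age, clock = -> { Time.now.to_f })
    @max_age = max_age
    @clock = clock
    @store = {}
  end

  # Returns the cached value while it is fresh, otherwise runs the block,
  # stores the result with a timestamp, and returns it.
  def fetch(key)
    entry = @store[key]
    now = @clock.call
    return entry[:value] if entry && now - entry[:stored_at] < @max_age

    value = yield
    @store[key] = { value: value, stored_at: now }
    value
  end

  # Explicit expiry, for when you know the underlying data changed.
  def expire(key)
    @store.delete(key)
  end
end

now = 0.0                                      # fake clock, in seconds
timeline = TtlCache.new(60, -> { now })        # "timeline": 1 minute
friends  = TtlCache.new(12 * 3600, -> { now }) # "friends": 12 hours

renders = 0
timeline.fetch(:page) { renders += 1; "timeline v#{renders}" }
timeline.fetch(:page) { renders += 1; "timeline v#{renders}" } # still fresh
friends.fetch(:page) { "friends list" }

now = 61.0                                     # one minute later
latest = timeline.fetch(:page) { renders += 1; "timeline v#{renders}" }
puts "timeline rendered #{renders} times, latest: #{latest}"
```

Time-based expiry trades freshness for simplicity: nothing has to track which writes invalidate which fragment, but a reader can see data up to max_age seconds old.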
Hope this helps!
If the site is relatively low activity you shouldn't cache any page. You cache because of performance problems, and performance problems come about because you have too much data to query, too many users, or worse, both of those situations at the same time.
Before you even think about caching, the first thing you do is look through your application for the requests that are taking up the most time. Not the slowest requests, but the requests your application spends the most aggregate time performing. That is, if you have a request A that runs 10 times at 1500ms and a request B that runs 5000 times at 250ms, you work on optimizing B first.
It's actually pretty easy to grep through your production.log and extract rendering times and URLs to combine them into a simple report. You can even do that in real-time if you want.
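In the example above, request A costs 10 x 1500ms = 15,000ms in total while request B costs 5000 x 250ms = 1,250,000ms, which is why B comes first. A minimal sketch of such a report follows. It assumes Rails 3 style log lines, where a 'Started GET "/path" ...' line is later followed by a 'Completed ... in Nms' line, and it assumes requests are not interleaved in the log (one process per log file). The method name aggregate_request_times and the sample paths are made up for the example.

```ruby
# Sums total time per request path and sorts by aggregate time, descending.
# Assumes each "Started" line is followed by its own "Completed" line.
def aggregate_request_times(lines)
  totals = Hash.new { |hash, key| hash[key] = { count: 0, total_ms: 0 } }
  current_path = nil

  lines.each do |raw|
    line = raw.lstrip
    if (match = line.match(/\AStarted \w+ "([^"]+)"/))
      current_path = match[1]
    elsif current_path && (match = line.match(/\ACompleted \d+ .* in (\d+)ms/))
      totals[current_path][:count] += 1
      totals[current_path][:total_ms] += match[1].to_i
      current_path = nil
    end
  end

  totals.sort_by { |_path, stats| -stats[:total_ms] }
end

# Sample data: one slow request and seven moderately slow ones.
sample = []
sample << 'Started GET "/reports" for 127.0.0.1 at 2011-05-01 10:00:00 +0000'
sample << 'Completed 200 OK in 1500ms (Views: 1400.0ms | ActiveRecord: 50.0ms)'
7.times do
  sample << 'Started GET "/products" for 127.0.0.1 at 2011-05-01 10:00:01 +0000'
  sample << 'Completed 200 OK in 250ms (Views: 200.0ms | ActiveRecord: 20.0ms)'
end

aggregate_request_times(sample).each do |path, stats|
  puts "#{path}: #{stats[:count]} requests, #{stats[:total_ms]}ms total"
end
```

Against a real file, something like aggregate_request_times(File.foreach("log/production.log")) should work the same way, provided the log format matches the assumptions above.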
Once you've identified a problematic request, you go about picking apart what it's doing to service the request. The first thing is to look for any queries that can be combined by using eager loading or by looking ahead a bit more to anticipate what you'll need. The next thing is to ensure you're not loading data that isn't used.
So many times you'll see code to list users and it's loading 50KB per person of biographical data, their Facebook and Twitter handles, literally everything about them, and all you use is their name.
Fetch as little as you need, and fetch it in the most efficient way you can. Use connection.select_rows when you don't need models.
The next step is to look at what kind of queries you're running, and how they're under-performing. Ensure your indexes are all set properly and are being used. Check that you're not doing complicated JOIN operations that could be resolved by a bit of tactical de-normalization.
Have a look at what data you are storing in your application, and try and find things that can be removed from your production database and warehoused somewhere else. Cycle your data out regularly when it's no longer relevant, preserve it in a separate database if you need to.
Then go over and have a look at how your database server is tuned. Does it have sufficiently large buffers? Is it on hardware that could be upgraded with more memory at a nominal cost? Too many people are running a completely un-tuned database server and with a few simple settings they can get ten-fold performance increases.
If, and only if, you still have a performance problem at this point then you might want to consider caching.
You know why you don't cache first? It's because once you cache something, that cached data is immediately stale. If parts of your application use this data under the assumption it's always up to date, you will have problems. If you don't expire this cache when the data does change, you will have problems. If you cache the data and never use it again, you're just clogging up your cache and you will have problems. Basically you'll have lots of problems when you use caching, so it's often a last resort.

What's the reasonable time for generating web page?

I'm working on a web app (Rails 3 based), and I really don't like the time it takes to generate a page: depending on the displayed data it takes up to 2.5 and even 4 seconds.
So I was just wondering what the average reasonable page generation time is in your apps. Say you check the generation time, see that it's 750ms and think "Ok, that should be fine even without caching". Or you see 1.5sec and think "Oh my God, the user won't wait so long and will leave the site".
There's a huge amount of research data regarding the time from query to rendering and user's experience. I'd recommend reading this useit.com article. After all Google integrated page speed in its results for a reason ;)
The 3 response-time limits are the same today as when I wrote about them in 1993 (based on 40-year-old research by human factors pioneers):
0.1 seconds gives the feeling of instantaneous response — that is, the outcome feels like it was caused by the user, not the computer. This level of responsiveness is essential to support the feeling of direct manipulation (direct manipulation is one of the key GUI techniques to increase user engagement and control — for more about it, see our Principles of Interface Design seminar).
1 second keeps the user's flow of thought seamless. Users can sense a delay, and thus know the computer is generating the outcome, but they still feel in control of the overall experience and that they're moving freely rather than waiting on the computer. This degree of responsiveness is needed for good navigation.
10 seconds keeps the user's attention. From 1–10 seconds, users definitely feel at the mercy of the computer and wish it was faster, but they can handle it. After 10 seconds, they start thinking about other things, making it harder to get their brains back on track once the computer finally does respond.
A 10-second delay will often make users leave a site immediately. And even if they stay, it's harder for them to understand what's going on, making it less likely that they'll succeed in any difficult tasks.
As a rule of thumb, think that you always should aim for a balance of optimization time vs time gained. Don't spend days optimizing the hell out of one routine when your images aren't compressed correctly, or your scripts/css not combined. Yes, faster is better, but a 90% gain in generating the page by setting up a smart cache beats a 10% gain after one week tweaking the algorithm.
Also don't look too much into the first-render-time when the framework has to load everything, but use stress-testing, cached or not, to simulate various situations.
Now, some data: some of the latest sites I worked on used DotNetNuke, a huge open-source CMS, and ASP.NET MVC, where you are nearer to the metal. Average page time with average db queries was 600-700 milliseconds for DotNetNuke. For ASP.NET MVC, it's 70-100 milliseconds... Users really like the second one :)
There's no 'right' answer to this - the faster the better. Personally I normally aim for < 200ms, although I know from experience that it can be quite difficult to achieve this in Rails on anything but simple apps. Try and figure out where your bottlenecks are and cache what you can.
Edit: There seems to be some confusion between page generation time and page render time. Obviously a quick page render is the goal, and on most sites doing things like reducing HTTP requests, gzipping CSS/JS are where you can get most of your quick wins. But if the page itself can take 4-5 seconds to generate, then you're probably right that your app is where you should start.
It depends on whether nothing is displayed for 2.5-4 seconds, or that the user already sees (a part of) the page from the start, and it finishes loading completely after 2.5-4 seconds. In that case the user doesn't experience a 2.5-4 second load. Take the http://www.nytimes.com/ website; I see most of it right away, but according to the Web Inspector it takes 1.94 seconds for it to be loaded completely.
And keep in mind that the speed will also depend on the browser, computer, internet connection. What's fast for you might be slower for others.
Measure your apdex score and see how the app is performing. That will give you a rough indication. From there, you can decide how you want to increase performance.
It also depends on what your site is: a system application for a business, or software as a service (SaaS)? If it's a system application, the users are forced to use it, so performance can be negotiated. If it is a SaaS, then the lower your apdex score, the more chance you have of losing your users' interest.
There are a few gems out there that measure performance and report on what your apdex is.
Here's a little more info: http://apdex.org/blog/?p=630
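For reference, the Apdex score for a threshold T counts requests finishing within T as satisfied, requests finishing within 4T as tolerating, and everything slower as frustrated; the score is (satisfied + tolerating / 2) / total, so 1.0 is the best possible value. A minimal sketch, where the method name apdex, the sample times and the 0.5 second threshold are all made up for illustration:

```ruby
# Apdex = (satisfied + tolerating / 2) / total
#   satisfied:  response time <= threshold
#   tolerating: threshold < response time <= 4 * threshold
#   frustrated: everything slower (contributes nothing)
def apdex(response_times, threshold)
  return nil if response_times.empty?

  satisfied  = response_times.count { |t| t <= threshold }
  tolerating = response_times.count { |t| t > threshold && t <= 4 * threshold }
  (satisfied + tolerating / 2.0) / response_times.size
end

times = [0.2, 0.4, 0.6, 1.5, 3.0] # response times in seconds
score = apdex(times, 0.5)
puts format("apdex: %.2f", score)
```

Here two requests are satisfied, two are tolerating and one is frustrated, giving (2 + 1) / 5 = 0.60.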
My personal rule - no page should take more than 0.05 seconds, or you are in trouble.
As long as you write proper code, you don't need to spend much time on optimization to stay under 0.05.
If you stick to giant frameworks, then you are out of luck.

How to prepare to be tech crunched

There is a good chance that we will be tech crunched in the next few days. Unfortunately, we have not gone live yet so we don't have a good estimation of how our system handles a production audience.
Our production setup consists of 2 EngineYard slices each with 3 mongrel instances, using Postgres as the database server.
Obviously a huge portion of how our app will hold up has to do with our actual code, queries etc. However, it would be good to see if there are any tips/pointers on what kind of load to expect, or experiences from people who have been through it. Do 6 mongrel instances (possibly 8 if the servers can take it) sound like they will handle the load, or at least most of it?
I have worked on several rails applications that experienced high load due to viral growth on Facebook.
Your mongrel count should be based on several factors. If your mongrels make API calls or deliver email and must wait for responses, then you should run as many as possible. Otherwise, try to maintain one mongrel per CPU core, with maybe a couple extra left over.
Make sure your server is using a Fair Proxy Balancer (not round robin). Here is the nginx module that does this: http://github.com/gnosek/nginx-upstream-fair/tree/master
And here are some other tips on improving and benchmarking your application performance to handle the load:
ActiveRecord
The most common problem Rails applications face is poor usage of ActiveRecord objects. It can be quite easy to make hundreds of queries when only one is necessary. The easiest way to determine if this could be a problem with your application is to set up New Relic. After making a request to each major page on your site, take a look at the New Relic SQL overview. If you see a large number of very similar queries in sequence (select * from posts where id = 1, select * from posts where id = 2, select * from posts...), this may be a sign that you need to use an :include in one of your ActiveRecord calls.
Some other basic ActiveRecord tips (These are just the ones I can think of off the top of my head):
If you're not doing it already, make sure to correctly use indexes on your database tables.
Avoid making database calls in views, especially partials. It can be very easy to lose track of how many database queries you are making in views. Push all queries and calculations into your models or controllers.
Avoid making queries in iterators. Usually this can be done by using an :include.
Avoid having rails build ActiveRecord objects for large datasets as much as possible. When you make a call like Post.find(:all).size, a new object is instantiated for every Post in your database (and it could be a large query too). In this case you would want to use Post.count(:all), which will make a single fast query and return an integer without instantiating any objects.
Associations like User has_many :objects create both a user.objects and a user.object_ids method. The latter skips instantiation of ActiveRecord objects and can be much faster. Especially when dealing with large numbers of objects, this is a good way to speed things up.
Learn and use named_scope whenever possible. It will help you keep your code tiny and makes it much easier to have efficient queries.
External APIs & ActionMailer
As much as you can, do not make API calls to external services while handling a request. Your server will stop executing code until a response is received. Not only will this add to load times, but your mongrel will not be able to handle new requests.
If you absolutely must make external calls during a request, you will need to run as many mongrels as possible since you may run into a situation where many of them are waiting for an API response and not doing anything else. (This is a very common problem when building Facebook applications)
The same applies to sending emails in some cases. If you expect many users to sign up in a short period of time, be sure to benchmark the time it takes for ActionMailer to deliver a message. If it's not almost instantaneous, then you should consider storing emails in your database and using a separate script to deliver them.
Tools like BackgroundRB have been created to solve this problem.
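The underlying idea is that the request handler only records the work and returns, while something else performs the slow delivery. Here is a minimal plain-Ruby sketch of that hand-off using an in-memory queue and a worker thread. This is a simplification and not what the answer describes literally: with several mongrel processes you would need shared storage such as a database table plus a separate delivery script. handle_signup and the addresses are hypothetical.

```ruby
outbox = Queue.new # thread-safe queue from Ruby's core library
delivered = []

# Worker: pulls messages off the queue and "delivers" them.
# A real worker would call the mailer here instead of appending to an array.
worker = Thread.new do
  loop do
    message = outbox.pop          # blocks until something is queued
    break if message == :stop     # sentinel used to shut the worker down
    delivered << message
  end
end

# Request handler: enqueue and return immediately, no waiting on delivery.
def handle_signup(outbox, email)
  outbox << { to: email, subject: "Welcome!" }
  :created
end

%w[a@example.com b@example.com c@example.com].each do |email|
  handle_signup(outbox, email)
end

outbox << :stop
worker.join
puts "delivered #{delivered.size} messages"
```

The request path now costs one queue push regardless of how slow delivery is, which is the property that keeps a mongrel free to take the next request.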
Caching
Here's a good guide on the different methods of caching in rails.
Benchmarking (Locating performance problems)
If you suspect a method may be slow, try benchmarking it in console. Here's an example:
>> Benchmark.measure { User.find(4).pending_invitations }
=> #<Benchmark::Tms:0x77934b4 @cutime=0.0, @label="", @total=0.0, @stime=0.0, @real=0.00199985504150391, @utime=0.0, @cstime=0.0>
Keep track of methods that are slow in your application. Those are the ones you want to avoid executing frequently. In some cases only the first call will be slow since Rails has a query cache. You can also cache the method yourself using Memoization.
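As a self-contained illustration of benchmarking plus hand-rolled memoization, here is a minimal sketch using only Ruby's standard library. The Report class and its totals method are hypothetical stand-ins for a slow method in your own app, and the sleep simulates the slow work.

```ruby
require 'benchmark'

class Report
  attr_reader :computations

  def initialize
    @computations = 0
  end

  # Memoized with ||=: the block runs only on the first call, later calls
  # return the stored value. Note that ||= does not cache nil or false.
  def totals
    @totals ||= begin
      @computations += 1
      sleep 0.05                 # stand-in for slow queries or calculations
      (1..1000).reduce(:+)
    end
  end
end

report = Report.new
first  = Benchmark.realtime { report.totals }
second = Benchmark.realtime { report.totals }
puts format("first call: %.4fs, second call: %.4fs", first, second)
```

The memoized value lives as long as the object does, so this suits per-request objects; anything longer-lived needs an expiry strategy.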
NewRelic will also provide a nice overview of how long methods and SQL calls take to execute.
Good luck!
Look into some load testing software like WEBLoad or if you have money, Quick Test Pro. This will help give you some idea. WEBLoad might be the best test in your situation.
You can generate thousands of virtual nodes hitting your site and you can inspect the performance of your servers from that load.
In my experience, having watched some of our customers absorb a crunching, the traffic was fairly modest - not the bone-crushing spike people seem to expect. Now, if you get syndicated and make it onto Yahoo's page or something, things may be different.
Search for the experiences of Facestat.com if you want to read about how they handled it (the Yahoo FP.)
My advice is just to be prepared to turn off signups or go to a more static version of your site if your servers get too hot. Using a monitoring/profiling tool is a good idea as well; I like the FiveRuns Manage tool for ease of setup.
Since you're using EngineYard, you should be able to allocate more machines to handle the load if necessary
Your big problems will probably not be the number of incoming requests, but the amount of data in your database, which will show you where your queries aren't using the indexes you're expecting, or are returning too much data. E.g. the User List page works with 10 users, but dies when you try to show 10,000 users on that one page because you didn't add pagination (the will_paginate plugin is almost your friend - watch out for the 'select count(*)' queries that are generated for you).
So the two things to watch:
Missing indexes
Too much data per page
For #1, there's a plugin that runs an 'explain ...' query after every query so you can check index usage manually
There is a plugin that can generate data for you for various types of data that may help you fill your database up to test these queries too.
For #2, use will_paginate plugin or some other way to reduce data per page.
We've got basically the same setup as you, 2 prod slices and a staging slice at EY. We found ab to be a great load testing tool - just write a bash script with the urls that you expect to get hit and point it at your slice. Watch NewRelic stats and it should give you some idea of the load your app can handle and where you might need to optimise.
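A minimal sketch of that idea, written in Ruby rather than bash to stay in one language: it only builds the ab command lines for a list of URLs so you can review them before running anything. The host staging.example.com, the paths and the request counts are placeholders; -n (total requests) and -c (concurrency) are standard ab options.

```ruby
require 'shellwords'

# Builds one `ab` command per path. Nothing is executed here; the strings
# can be pasted into a shell or passed to system() once you are happy.
def ab_commands(base_url, paths, requests: 200, concurrency: 10)
  paths.map do |path|
    ["ab", "-n", requests.to_s, "-c", concurrency.to_s, base_url + path].shelljoin
  end
end

commands = ab_commands("http://staging.example.com", ["/", "/users", "/posts"])
puts commands
```

Point it at a staging slice rather than production, and watch your monitoring while the commands run.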
We also found query_reviewer to be very useful as well. It is great for finding those un-indexed tables and n+1 queries.

Resources