Could performance issues emerge when using ActionCable in production? - ruby-on-rails

I'm planning a Rails app with a very content-rich interactive page that many users will connect to.
Development went well, and small-scale testing on the dev servers went without a hitch either.
Problems started when we began alpha testing with selected groups of people. The server would suddenly grind to a halt, and Nginx would stop because its queue was full. I was at a loss for a while, but after looking around I came to the conclusion that the live ActionCable connections were completely eating up my memory. This gets especially bad when a user repeatedly reloads a page that subscribes to ActionCable: additional processes become active and the server stops completely, cured only by an Nginx restart.
I currently run a 2-core, 1 GB memory, SSD-backed VPS for alpha testing, with perhaps 20 concurrent users at most. Should I be running into performance problems with such a load? Or should tuning the code, Redis, or Passenger fix this?
I know it's hard to say anything definitive without more specifics, but could a ballpark estimate be made from this information?

After some googling and testing of Nginx settings, adding this directive to the Nginx settings for Passenger seems to have dramatically improved the performance issue.
location /special_websocket_endpoint {
    passenger_app_group_name foo_websocket;
    passenger_force_max_concurrent_requests_per_process 0;
}
more info here

20 concurrent users, even with multiple tabs per user, is still fewer than about 100 concurrent WebSocket connections, which is not that many.
The first thing I'd look for is leaks: cases where, for some reason, a WebSocket connection or another resource (open files etc.) does not get freed when the actual user disconnects. Make sure you're running recent versions of Rails and Passenger, as there was a bug in Rails causing similar behaviour (see for details)
Also, while ActionCable plus Passenger inside Nginx allows you to run everything inside a single process, that is not a good idea when you expect some load.
When running a plain Nginx with separate Rails servers for regular requests and for cable, at least the other parts of the app will keep working to some degree under such conditions.
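One way to look for the kind of leak described above is to sample the resident memory of the server process before and after a burst of page reloads. The following is only a minimal sketch under stated assumptions: it reads Linux's /proc, the helper names (parse_rss_kb, rss_kb, rss_growth_kb) are made up for illustration, and by default it measures the script's own process, so in practice you would pass the pid of the Rails or Passenger process instead.

```ruby
# Parse the resident set size (in kB) out of /proc/<pid>/status text.
# Returns nil when the text has no VmRSS line.
def parse_rss_kb(status_text)
  match = status_text.match(/^VmRSS:\s+(\d+)\s+kB/)
  match && match[1].to_i
end

# Current RSS of a process in kB, or nil where /proc is unavailable
# (for example on macOS) or the pid does not exist.
def rss_kb(pid = Process.pid)
  parse_rss_kb(File.read("/proc/#{pid}/status"))
rescue SystemCallError
  nil
end

# Difference between two samples. A delta that keeps growing across many
# reloads suggests connections or other resources are not being freed.
def rss_growth_kb(before_kb, after_kb)
  return nil if before_kb.nil? || after_kb.nil?
  after_kb - before_kb
end

before = rss_kb
# ... exercise the app here, e.g. reload the subscribing page several times ...
after = rss_kb
puts "RSS before: #{before.inspect} kB, after: #{after.inspect} kB, " \
     "growth: #{rss_growth_kb(before, after).inspect} kB"
```

A single sample proves little; the signal is memory that keeps climbing after users have disconnected.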


ActionCable slow in production

I am building a basic chat application for customer support on a website. It works flawlessly in development on a local server. I pushed the changes to the server, but there it behaves extremely slowly. The application itself is fast, but pub/sub over the ActionCable channels is slow.
I am using Nginx with Puma as the web server and Redis for pub/sub. I have four channels, and two of them have heavy client-side code (coffee.erb files). How can I reduce the latency of the ActionCable channels? How can I debug what is causing the lag?
Thank you in advance. If any code is required, please mention it in the comments and I will add it to the question.
The most common cause for things running far more slowly on a server than locally is that the server doesn't have nearly the same amount of RAM and starts swapping. It is just like an app being ridiculously slow on older phones.
In one case, the system swaps in and out memory, in the other case, the app swaps in and out resources itself (often implicitly through resource caches provided by the API).
The effect is the same: massive I/O overhead that doesn't exist on your development system / modern phone, leading to runtime behavior that is slower by several orders of magnitude.
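To check whether the server really is swapping, you can compare SwapTotal and SwapFree in /proc/meminfo. This is a rough sketch, assuming a Linux host; the helper names meminfo_kb and swap_used_kb are hypothetical.

```ruby
# Extract a "<Key>:   <n> kB" value from /proc/meminfo text, or nil if absent.
def meminfo_kb(meminfo_text, key)
  match = meminfo_text.match(/^#{Regexp.escape(key)}:\s+(\d+)\s+kB/)
  match && match[1].to_i
end

# Swap currently in use, in kB (SwapTotal minus SwapFree), or nil if unknown.
def swap_used_kb(meminfo_text)
  total = meminfo_kb(meminfo_text, "SwapTotal")
  free  = meminfo_kb(meminfo_text, "SwapFree")
  return nil if total.nil? || free.nil?
  total - free
end

# Only attempt the read where /proc/meminfo exists (Linux).
if File.readable?("/proc/meminfo")
  puts "swap in use: #{swap_used_kb(File.read('/proc/meminfo')).inspect} kB"
end
```

A non-zero value that grows while the app is under load is a strong hint that memory, not the code path itself, is the bottleneck.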

Thin vs Unicorn on Heroku

Just wanted to get people's opinions on using Unicorn vs Thin as a Rails server. Most of the articles/benchmarks I found online seem very incomplete, so it would be nice to have a centralized place to discuss it.
Unicorn is a multi-process server, while Thin is an event-based, non-blocking server. Event-based servers are great if your code is asynchronous and non-blocking, but vanilla Rails is blocking. So unless you use non-blocking Rails libraries, I really don't see the advantage of using Thin. Even worse, in a non-blocking server, a blocking call inside the I/O loop blocks the entire loop, and no more requests can be handled until that call returns. Blocking libraries are going to slow Thin down!
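The effect of a blocking call on a single loop can be illustrated with plain Ruby. This is only a toy sketch, not how Thin or Unicorn are implemented: sleep stands in for a blocking database or HTTP call, and threads stand in for Unicorn's worker processes. All names here are hypothetical.

```ruby
require "benchmark"

# Hypothetical stand-in for a blocking call (database query, HTTP request).
def blocking_request(seconds)
  sleep seconds
  :done
end

# One loop, one request at a time: each blocking call holds up the rest.
def time_single_loop(requests, delay)
  Benchmark.realtime { requests.times { blocking_request(delay) } }
end

# Several workers (threads here, as a stand-in for worker processes):
# the blocking calls overlap instead of queuing behind each other.
def time_with_workers(requests, delay)
  Benchmark.realtime do
    Array.new(requests) { Thread.new { blocking_request(delay) } }.each(&:join)
  end
end

serial   = time_single_loop(4, 0.05)
parallel = time_with_workers(4, 0.05)
puts format("single loop: %.0f ms, 4 workers: %.0f ms", serial * 1000, parallel * 1000)
```

With four blocking calls the single loop takes roughly four times as long, which is the queuing the question describes.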
Why did Heroku choose Thin as their default server (for cedar)? They are smart guys, so I'm sure they had a reason.
Below is a link that suggests replacing Thin with 4 Unicorn workers - this makes perfect sense to me.
4 Unicorn workers on Heroku
Thin is easy to configure - not optimal, but it just works in the Heroku environment.
Unicorn can be more efficient, but it needs to be configured: How many workers? Preload App? What do you pick?
I have released Unicorn Heroku apps with workers set to 3, 5 and 8, based on how big each app is. How much code there is, how much memory is used and how much traffic you get all go into picking this number, and you need to monitor over time to make sure you got the number right and your app isn't running out of memory.
Preload false - this will make your app start slower, but when Unicorn restarts a worker, this is 'safer' with network connections (memcache, postgres, mongo etc)
Preload true - this is better, but you need to handle server re-connections correctly in the pre and post fork code.
Thin has none of these issues out of the box, but you only get one process of execution.
Summary: It's really hard to configure Unicorn out of the box to work well (or at all) for everyone, whereas Thin can just work to get people running with fewer support requests.
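As a starting point for picking the worker count discussed above, you can divide the memory available by the memory one worker uses. The sketch below is back-of-the-envelope arithmetic only; workers_for and all the numbers (512 MB available, 150 MB or 90 MB per worker) are illustrative assumptions, and the real figure still has to come from monitoring your own app.

```ruby
# How many workers fit in the memory available, leaving optional headroom.
# Always returns at least 1. All inputs are in megabytes.
def workers_for(available_mb:, per_worker_mb:, headroom_mb: 0)
  raise ArgumentError, "per_worker_mb must be positive" unless per_worker_mb.positive?
  usable = available_mb - headroom_mb
  [(usable / per_worker_mb).floor, 1].max
end

# Assumed figures: a heavier app at ~150 MB per worker,
# and a leaner app at ~90 MB per worker with 50 MB kept free.
puts workers_for(available_mb: 512, per_worker_mb: 150)
puts workers_for(available_mb: 512, per_worker_mb: 90, headroom_mb: 50)
```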
Recently (only a few months ago) the folks behind Phusion Passenger added support for Heroku. This is definitely an alternative you should try to see if it fits your needs.
It is blazing fast even with 1 dyno, and the drop in response time is palpable.
A simple Passenger Ruby Heroku Demo is hosted on GitHub.
The main benefits that Passenger on Heroku claims are:
Static asset acceleration through Nginx - Don't let your Ruby app serve static assets, let Nginx do it for you and offload your app for the really important tasks. Nginx will do a much better job.
Multiple worker processes - Instead of running only one worker on a dyno, Phusion Passenger runs multiple workers on a single dyno, thus utilizing its resources to its fullest and giving you more bang for the buck. This approach is similar to Unicorn's. But unlike Unicorn, Phusion Passenger dynamically scales the number of worker processes based on current traffic, thus freeing up resources when they're not necessary.
Memory optimizations - Phusion Passenger uses less memory than Thin and Unicorn. It also supports copy-on-write virtual memory in combination with code preloading, thus making your app use even less memory when run on Ruby 2.0.
Request/response buffering - The included Nginx buffers requests and responses, thus protecting your app against slow clients (e.g. mobile devices on mobile networks) and improving performance.
Out-of-band garbage collection - Ruby's garbage collector is slow, but why bother your visitors with long response times? Fix this by running garbage collection outside of the normal request-response cycle! This concept, first introduced by Unicorn, has been improved upon: Phusion Passenger ensures that only one request at the same time is running out-of-band garbage collection, thus eliminating all the problems Unicorn's out-of-band garbage collection has.
JRuby support - Unicorn's a better choice than Thin, but it doesn't support JRuby. Phusion Passenger does.
Hope this helps.
Heroku does not use intelligent routing - it will randomly assign jobs to dynos regardless of whether the dyno is busy. Thus, if your dyno cannot handle multiple jobs at once, you will get latency (perhaps massive latency) even if you are paying for lots of other dynos that are free. " That's right — if your app needs 80 dynos with an intelligent router, it needs 4,000 with a random router. "
Heroku says they are working on this, and their plan is to make it easier to use Unicorn. They basically said "Oops, we didn't notice that this was a problem for a few years... and now that we look, it's definitely a problem for Thin... so I guess you need to use a different program than the one we've been pushing all this time."
From the official Heroku explanation (second link above):
"Rails, in fact, does not yet reliably support concurrent request handling. This leaves Rails developers unable to leverage the additional concurrency capabilities offered by the Cedar stack, unless they move to a concurrent web server like Puma or Unicorn.
Rails apps deployed to Cedar with Thin can rather quickly end up with request queuing problems. Because the Cedar router no longer does any queuing on behalf of the app, requests queued at the dyno must wait until the single Rails process works its way through the queue. Many customers have run into this issue and we failed to take action and provide them with a better approach to deploying Rails apps on Cedar."
Also of interest is that their performance tools, including New Relic, have not been reporting time spent in the dyno queue.
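The difference between random and intelligent routing described in this answer can be sketched with a small simulation. It is a toy model under stated assumptions, not Heroku's router: a burst of requests arrives at once, each dyno handles one request at a time, and the function names are made up. The fixed seed (42) only makes the run repeatable.

```ruby
# A burst of requests is assigned to dynos at random.
# Returns the deepest queue that forms on any single dyno.
def max_queue_random(requests:, dynos:, rng: Random.new(42))
  queues = Array.new(dynos, 0)
  requests.times { queues[rng.rand(dynos)] += 1 }
  queues.max
end

# The same burst, but each request goes to the currently least busy dyno.
def max_queue_least_busy(requests:, dynos:)
  queues = Array.new(dynos, 0)
  requests.times do
    idx = queues.each_with_index.min_by { |depth, _| depth }.last
    queues[idx] += 1
  end
  queues.max
end

puts "random routing, deepest queue:     #{max_queue_random(requests: 80, dynos: 80)}"
puts "least-busy routing, deepest queue: #{max_queue_least_busy(requests: 80, dynos: 80)}"
```

With least-busy routing the deepest queue is the smallest it can possibly be, so random assignment can only match it or do worse. On a single-process server, every extra request in that queue waits for the ones ahead of it.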

Is there some sort of mechanism in Phusion Passenger to keep whole applications from going down?

The following is beginning to become a huge problem for us.
We have about 15 Rails applications for our enterprise, running on a massive server. The problem occurs when two or three applications are wildly popular and they start taking up all the instances in the PassengerMaxPoolSize. As soon as that happens, other applications start losing instances, causing several apps to be down completely at any given instant. The mechanism we need is the following:
PassengerMinInstancesPerApp 1
That's it.
BUT, passenger doesn't have this, so we've tried all sorts of variations on PassengerMaxPoolSize, PassengerMaxRequests, PassengerMaxInstancesPerApp, PassengerPoolIdleTime, and PassengerUseGlobalQueue.
Here are the issues with our configuration:
1: PassengerMaxPoolSize is set to about 38...any higher and for some strange reason the other 200 regular http sites start to crawl.
2: PassengerMaxRequests is set to 1000, but for applications that are used only once or twice a week, they still get swamped and killed by other more popular apps
3: PassengerPoolIdleTime is set to 0 because we have no reason to want to shut down applications unnecessarily.
4: PassengerGlobalQueue is on to allow for slightly better load balancing.
5: PassengerMaxInstancesPerApp WAS set, and should have worked, but for some reason it caused a huge lag, similar to the PassengerMaxPoolSize problem...this COULD solve the problem, but it appears not to work...
Unfortunately getting another server is not an option (one might imagine moving the more popular apps to a separate box).
Anyone know if Phusion is planning to make a PassengerMinInstancesPerApp parameter? Or if they plan to install a mechanism that will prohibit a given application from being completely killed? (Or, if you have any other suggestions, I'm open to possible solutions.)
Yes. Coming in Passenger 3.
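The behaviour the question asks for, a guaranteed minimum per application with the rest of the pool going to whoever needs it, can be sketched as follows. This is purely an illustration of the idea, not Passenger's actual algorithm; allocate_pool and the app names are hypothetical.

```ruby
# Split a fixed pool of instances across apps.
# Every app first gets min_per_app instances; the remaining slots go,
# one at a time, to the app with the largest unmet demand.
def allocate_pool(pool_size:, demand:, min_per_app: 1)
  return {} if demand.empty?
  if demand.size * min_per_app > pool_size
    raise ArgumentError, "pool too small to give every app #{min_per_app} instance(s)"
  end

  allocation = demand.keys.to_h { |app| [app, min_per_app] }
  remaining = pool_size - allocation.values.sum
  remaining.times do
    app, unmet = demand.map { |name, wanted| [name, wanted - allocation[name]] }
                       .max_by { |_, gap| gap }
    break if unmet <= 0 # nobody wants more; leave the slots idle
    allocation[app] += 1
  end
  allocation
end

p allocate_pool(pool_size: 5, demand: { popular_a: 10, popular_b: 10, rarely_used: 0 })
```

However busy the popular apps get, the rarely used one keeps its single instance, which is the guarantee the question is after.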

Thin + Nginx Production ready combination for RubyOnRails Application

I have recently installed Nginx + Thin on my deployment server, but I am not sure how it will perform under a heavy request/response load, let's say 1000 requests per second.
So far the speed of Thin is good at 10-100 requests per second.
I want to know how it behaves when higher volumes are processed across the cluster.
Please guide me on this :-)
Multiple thin processes and nginx are capable of providing lots of speed, depending on what your application is doing. So, the problem will be your application code, the speed of your application server, and your database server.
Scaling Rails has been recently covered in depth by the Scaling Rails Screencasts. I recommend you start there. My 5 step program to scaling Rails would be:
First step is to have the tools to look at what is slow in your application. Do not spend time optimizing everything in your application when you don't know what the problem is.
The easiest way to be able to handle lots of requests/second is with page caching.
If you can't do that, cache everything possible (fragment caching, use memcached to cache data, etc), to speed up your application.
After that, optimize your application as best as possible, make SQL queries fast, index everything, etc.
If you still need more speed, throw more hardware at the problem. Get a big, powerful database server, a bunch of app servers, and proxy your requests across them. You can start here, too, but it will only delay the optimization process.
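The first steps above (measure before optimizing, then cache what is expensive) can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not Rails' caching API: slow_lookup stands in for an expensive query or render, and TinyCache and timed_ms are hypothetical helpers.

```ruby
require "benchmark"

# Hypothetical stand-in for an expensive query or page render.
def slow_lookup(id)
  sleep 0.05
  "record-#{id}"
end

# A tiny in-memory cache: compute a value once, then reuse it.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

# Run a block and return [its result, elapsed milliseconds].
def timed_ms
  result = nil
  ms = Benchmark.realtime { result = yield } * 1000
  [result, ms]
end

cache = TinyCache.new
_, cold_ms = timed_ms { cache.fetch(1) { slow_lookup(1) } }
_, warm_ms = timed_ms { cache.fetch(1) { slow_lookup(1) } }
puts format("cold: %.1f ms, warm: %.3f ms", cold_ms, warm_ms)
```

Timing the cold and warm paths separately shows whether a given cache is actually worth having before you add more of them.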
If you have a single server, I think the main thing, apart from everything already mentioned, is not to skimp on its specs. Trying to get too much to run on too little is just a recipe for disaster.
It is also a good idea to have monit or God monitoring your Thin instances. I started out with God, but it leaked memory pretty badly on Ruby 1.8.6, so I stopped using it in favour of monit. Monit is written in C, I believe, and has a tiny memory footprint, so I'd recommend that one.
If all that seems like a bit much just to keep Nginx and Thin playing nicely, you may want to look into an all-in-one solution like Passenger or LiteSpeed. I have very little experience with these, so I can offer no substantial advice on them.

How can I find out why my app is slow?

I have a simple Rails app deployed on a 500 MB Slicehost VPS. I'm the only one who uses the app. When I run it on my laptop, it's fast enough. But the deployed version is insanely slow. It takes 6 to 10 seconds to load the login screen.
I would like to find out why it's so slow. Is it my code? (Don't think so because it's much faster locally, but maybe.) Is it Slicehost's server being overloaded? Is it the Internet?
Can someone suggest a technique or set of steps I can take to help narrow down the cause of this problem?
Sorry forgot to mention. I'm running it under CentOS 5 using Phusion Passenger (AKA mod_rails or mod_rack).
If it is only slow the first time you load it, that is probably because Passenger killed the process due to inactivity. I don't remember all the details, but I do recall reading about people who used cron jobs to keep at least one process alive, to avoid the lag that occurs when Passenger needs to reload the environment.
Edit: more details here
Specifically - pool idle time defaults to 2 minutes which means after two minutes of idling passenger would have to reload the environment to serve the next request.
First, find out if there's a particularly slow response from the server. Use Firefox and the Firebug plugin to see how long each component (including JavaScript and graphics) takes to download. Assuming the main page itself is what is taking all the time, you can start profiling the application. You'll need to find a good profiler, and as I don't actually work in Ruby on Rails, I can't suggest any: google "profile ruby on rails" for some options.
As YenTheFirst points out, the server software and config you're using may contribute to a slowdown, but A) Slicehost doesn't choose that, you do, as Slicehost just provides very raw server "slices" that you can treat as dedicated machines. B) you're unlikely to see a script that runs instantly suddenly take 6 seconds just because it's running as CGI. Something else must be going on. Check how much RAM you're using: have you gone into swap? Is the login slow only the first time it's hit, indicating some startup issue, or is it always that slow? Is static content served slowly? That'd tend to mean some network issue (either on the Slicehost side, or your local network) is slowing things down, assuming you're not in swap.
When you say "fast enough" you're being vague: does the laptop version take 1 second to the Slicehost 6? That wouldn't be entirely surprising, if the laptop is decent: after all, the reason slices are cheap is that they're a fraction of a full server. You're probably using 1/32 of an 8-core machine at Slicehost, as opposed to both cores of a modern laptop. The Slicehost cores are quick, but your laptop could be a screamer compared to 1/4 of a core. :)
Try to pinpoint where the slowness lies:
1/ Is the application slow, or the infrastructure (network + web server)?
put a static file on your web server, and access it through your browser
2/ If it is fast, it is probably a problem with the application + server configuration.
database access is slow?
try a page with a simple loop: is it slow?
3/ If it is slow, it is probably your infrastructure. You can check:
bad network connection: do a packet capture (with Wireshark for example) and look for retransmissions, duplicate packets, etc.
DNS resolution is slow?
server is misconfigured?
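The checks above can be scripted so that the static file and the dynamic page are timed the same way. The following is a minimal sketch using only Ruby's standard library; time_dns and time_get are hypothetical helper names, and the URLs are whatever you pass on the command line (for example a static file and the login page).

```ruby
require "benchmark"
require "net/http"
require "resolv"
require "uri"

# Seconds spent resolving a host name, or nil if it does not resolve.
def time_dns(host)
  Benchmark.realtime { Resolv.getaddress(host) }
rescue Resolv::ResolvError
  nil
end

# Fetch a URL once and report status code, body size and elapsed seconds.
def time_get(url)
  uri = URI(url)
  response = nil
  seconds = Benchmark.realtime { response = Net::HTTP.get_response(uri) }
  { status: response.code, bytes: response.body.to_s.bytesize, seconds: seconds }
end

# Usage: ruby timing.rb http://example.com/static.html http://example.com/login
ARGV.each do |url|
  dns = time_dns(URI(url).host)
  get = time_get(url)
  puts format("%s  dns: %s  get: %.0f ms  status: %s  bytes: %d",
              url, dns ? format("%.0f ms", dns * 1000) : "n/a",
              get[:seconds] * 1000, get[:status], get[:bytes])
end
```

If the static file is as slow as the dynamic page, the application is probably not the culprit.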
What is Slicehost using to serve it?
Fast options are things like Mongrel, or Apache's mod_rails (also called Phusion Passenger).
These are dedicated servers (or plugins to servers) which run an instance of your rails app.
If your host isn't using that, then it's probably defaulting to CGI. Rails comes with a simple CGI script that will serve the page, but it reloads the app for every page.
(edit: I suspect that this is the most likely case, that your app is running off of the CGI in /webapp_directory/public/dispatch.cgi, which would explain the slowness. This tends to be a default deployment on many hosts, since it doesn't require extra configuration on their part, but it doesn't give good performance)
If your host supports "Fast CGI", rails supports that too. Fast CGI will open a CGI session, and keep it open for multiple pages, so you get much better performance, but it's not nearly as good as Mongrel or mod_rails.
Secondly, is it in 'production' or 'development' mode? The easy way to tell is to go to a page in your app that gives an error. If it shows you a stack trace, it's in development mode, which is slower than production mode. Mongrel and mod_rails have startup options to determine whether to run the app in production or development mode.
Finally, if your database is slow for whatever reason, that will be a big bottleneck as well. If you do have a good deployment (Mongrel/mod_rails/etc.) in production mode, try looking into that.
Do you have a lot of data in your DB? I would double-check that you have indexed all the appropriate columns, because this can make a huge difference. On your local dev system, you probably have a lot more memory than on your 500 MB slice, which would result in the DB running a lot slower there if you have big, unindexed tables. You can also enable the slow query log in MySQL to pinpoint columns without indexes.
Other than that, yes- passenger will need to spool up a process for you if you have not been using the site recently. If this is the case, you should see a significant speed increase on second, and especially third and later page loads.
You might want to run a local virtual machine with 500 MB. Are you doing a lot of client-server interaction? Delays over the WAN are significant.
You might want to check out RPM (there's a free "lite" version too) and/or New Relic's Tune Up.
Your CPU time is guaranteed by Slicehost using the Xen virtualization system, so it's not that. Don't have the other answers for you, sorry! Might try 'top' on a console while you're trying to access the page.
If you are using Firefox and doing localhost testing (or maybe even testing on a LAN) you may want to try editing the network.dns.disableIPv6 setting.
Type about:config in the address bar and filter for network.dns.disableIPv6 and double-click to set to true.
This bug has been reported mainly from Vista OS's, but some others as well.
You could try running 'top' when you SSH in to see which process is heavy. If you also have problems logging in, you may try getting the statistics in the Slicehost manager.
If you discover it is MySQL's fault, consider decreasing the number of servers it can spawn.
512 MB seems decent for a Rails application, so you might also have to check whether something is misconfigured.
